Qom Province
MultiCNKG: Integrating Cognitive Neuroscience, Gene, and Disease Knowledge Graphs Using Large Language Models
Sarabadani, Ali, Fard, Kheirolah Rahsepar
The advent of large language models (LLMs) has revolutionized the integration of knowledge graphs (KGs) in biomedical and cognitive sciences, overcoming limitations in traditional machine learning methods for capturing intricate semantic links among genes, diseases, and cognitive processes. We introduce MultiCNKG, an innovative framework that merges three key knowledge sources: the Cognitive Neuroscience Knowledge Graph (CNKG) with 2.9K nodes and 4.3K edges across 9 node types and 20 edge types; Gene Ontology (GO) featuring 43K nodes and 75K edges in 3 node types and 4 edge types; and Disease Ontology (DO) comprising 11.2K nodes and 8.8K edges with 1 node type and 2 edge types. Leveraging LLMs like GPT-4, we conduct entity alignment, semantic similarity computation, and graph augmentation to create a cohesive KG that interconnects genetic mechanisms, neurological disorders, and cognitive functions. The resulting MultiCNKG encompasses 6.9K nodes across 5 types (e.g., Genes, Diseases, Cognitive Processes) and 11.3K edges spanning 7 types (e.g., Causes, Associated with, Regulates), facilitating a multi-layered view from molecular to behavioral domains. Assessments using metrics such as precision (85.20%), recall (87.30%), coverage (92.18%), graph consistency (82.50%), novelty detection (40.28%), and expert validation (89.50%) affirm its robustness and coherence. Link prediction evaluations with models like TransE (MR: 391, MRR: 0.411) and RotatE (MR: 263, MRR: 0.395) show competitive performance against benchmarks like FB15k-237 and WN18RR. This KG advances applications in personalized medicine, cognitive disorder diagnostics, and hypothesis formulation in cognitive neuroscience.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
DKG-LLM : A Framework for Medical Diagnosis and Personalized Treatment Recommendations via Dynamic Knowledge Graph and Large Language Model Integration
Sarabadani, Ali, Shamami, Maryam Abdollahi, Sadeghsalehi, Hamidreza, Asadi, Borhan, Hesaraki, Saba
Large Language Models (LLMs) have grown exponentially since the release of ChatGPT. These models have gained attention due to their robust performance on various tasks, including language processing tasks. These models achieve understanding and comprehension of tasks by training billions of parameters. The development of these models is a transformative force in enhancing natural language understanding and has taken a significant step towards artificial general intelligence (AGI). In this study, we aim to present the DKG-LLM framework. The DKG-LLM framework introduces a groundbreaking approach to medical diagnosis and personalized treatment recommendations by integrating a dynamic knowledge graph (DKG) with the Grok 3 large language model. Using the Adaptive Semantic Fusion Algorithm (ASFA), heterogeneous medical data (including clinical reports and PubMed articles) and patient records dynamically generate a knowledge graph consisting of 15,964 nodes in 13 distinct types (e.g., diseases, symptoms, treatments, patient profiles) and 127,392 edges in 26 relationship types (e.g., causal, therapeutic, association). ASFA utilizes advanced probabilistic models, Bayesian inference, and graph optimization to extract semantic information, dynamically updating the graph with approximately 150 new nodes and edges in each data category while maintaining scalability with up to 987,654 edges. Real-world datasets, including MIMIC-III and PubMed, were utilized to evaluate the proposed architecture. The evaluation results show that DKG-LLM achieves a diagnostic accuracy of 84.19%. The model also has a treatment recommendation accuracy of 89.63% and a semantic coverage of 93.48%. DKG-LLM is a reliable and transformative tool that handles noisy data and complex multi-symptom diseases, along with feedback-based learning from physician input.
- Europe > United Kingdom (0.14)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- North America > United States (0.04)
- (2 more...)
- Research Report > Experimental Study (0.93)
- Research Report > New Finding (0.86)
A Soft Robotic Module with Pneumatic Actuation and Enhanced Controllability Using a Shape Memory Alloy Wire
In this paper, a compressed air-actuated soft robotic module was developed by incorporating a shape memory alloy (SMA) wire into its structure to achieve the desired bending angle with greater precision. First, a fiber-reinforced bending module with a strain-limiting layer made of polypropylene was fabricated. The SMA wire was then placed in a silicon matrix, which was used as a new strain-limiting layer. A simple closed-loop control algorithm was used to regulate the bending angle of the soft robot within its workspace. A camera was utilized to measure the angular changes in the vertical plane. Different angles, ranging from 0 to 65 degrees, were covered to evaluate the performance of the module and the bending angle control algorithm. The experimental tests demonstrate that using the SMA wire results in more precise control of bending in the vertical plane. In addition, it is possible to bend more with less working pressure. The error range was reduced from an average of 5 degrees to 2 degrees, and the rise time was reduced from an average of 19 seconds to 3 seconds.
- North America > United States > North Carolina > Mecklenburg County > Charlotte (0.14)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
FarsEval-PKBETS: A new diverse benchmark for evaluating Persian large language models
Shamsfard, Mehrnoush, Saaberi, Zahra, manesh, Mostafa Karimi, Hashemi, Seyed Mohammad Hossein, Vatankhah, Zahra, Ramezani, Motahareh, Pourazin, Niki, Zare, Tara, Azimi, Maryam, Chitsaz, Sarina, Khoraminejad, Sama, Mortazavi, Morteza Mahdavi, Chizari, Mohammad Mahdi, Maleki, Sahar, Majd, Seyed Soroush, Masumi, Mostafa, Khoeini, Sayed Ali Musavi, Mohseni, Amir, Alipour, Sogol
Research on evaluatin g and analyzing large language models (LLMs) has been extensive for high - resource languages such as English, yet their performance in languages such as Persian has received considerably less attention. This paper introduces FarsEval - PKBETS benchmark, a subset of FarsEval project for evaluat ing large language models in Persian. This benchmark consists of 4,000 questions and answers in various formats, including multiple - choice, short - answer, and descriptive responses. It covers a wide range of domains and tasks, including medicine, law, religion, Persian language, encyclopedic knowledge, human preferences, social knowledge, ethics and bias, text generation, and respecting others' rights. This benchmark incorporates linguistic, cultural, and local considera tions relevant to the Persian language and Iran. To ensure the questions are challenging for current LLMs, three models -- Llama3 - 70B, PersianMind, and Dorna -- were evaluated using this benchmark. Their average accuracy was below 50%, meaning they provided full y correct answers to fewer than half of the questions. These results indicate that current language models are still far from being able to solve this benchmark.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Oceania > Australia > Western Australia > Perth (0.04)
- (7 more...)
- Law (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Education (1.00)
Leveraging Machine Learning and Deep Learning Techniques for Improved Pathological Staging of Prostate Cancer
Ghalamkarian, Raziehsadat, Ghalamkarian, Marziehsadat, Ahmadi, MortezaAli, Ahmadi, Sayed Mohammad, Diyanat, Abolfazl
Prostate cancer (Pca) continues to be a leading cause of cancer-related mortality in men, and the limitations in precision of traditional diagnostic methods such as the Digital Rectal Exam (DRE), Prostate-Specific Antigen (PSA) testing, and biopsies underscore the critical importance of accurate staging detection in enhancing treatment outcomes and improving patient prognosis. This study leverages machine learning and deep learning approaches, along with feature selection and extraction methods, to enhance PCa pathological staging predictions using RNA sequencing data from The Cancer Genome Atlas (TCGA). Gene expression profiles from 486 tumors were analyzed using advanced algorithms, including Random Forest (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGB), and Support Vector Machine (SVM). The performance of the study is measured with respect to the F1-score, as well as precision and recall, all of which are calculated as weighted averages. The results reveal that the highest test F1-score, approximately 83%, was achieved by the Random Forest algorithm, followed by Logistic Regression at 80%, while both Extreme Gradient Boosting (XGB) and Support Vector Machine (SVM) scored around 79%. Furthermore, deep learning models with data augmentation achieved an accuracy of 71. 23%, while PCA-based dimensionality reduction reached an accuracy of 69.86%. This research highlights the potential of AI-driven approaches in clinical oncology, paving the way for more reliable diagnostic tools that can ultimately improve patient outcomes.
- Europe > United Kingdom > England (0.04)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
- Asia > Middle East > Iran > Isfahan Province > Isfahan (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Predicting Drive Test Results in Mobile Networks Using Optimization Techniques
Taheri, MohammadJava, Diyanat, Abolfazl, Ahmadi, MortezaAli, Nazari, Ali
Mobile network operators constantly optimize their networks to ensure superior service quality and coverage. This optimization is crucial for maintaining an optimal user experience and requires extensive data collection and analysis. One of the primary methods for gathering this data is through drive tests, where technical teams use specialized equipment to collect signal information across various regions. However, drive tests are both costly and time-consuming, and they face challenges such as traffic conditions, environmental factors, and limited access to certain areas. These constraints make it difficult to replicate drive tests under similar conditions. In this study, we propose a method that enables operators to predict received signal strength at specific locations using data from other drive test points. By reducing the need for widespread drive tests, this approach allows operators to save time and resources while still obtaining the necessary data to optimize their networks and mitigate the challenges associated with traditional drive tests.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
- Asia > Indonesia > Bali (0.04)
- Information Technology > Networks (0.66)
- Telecommunications > Networks (0.48)
FarSSiBERT: A Novel Transformer-based Model for Semantic Similarity Measurement of Persian Social Networks Informal Texts
Sadjadi, Seyed Mojtaba, Rajabi, Zeinab, Rabiei, Leila, Moin, Mohammad-Shahram
One fundamental task for NLP is to determine the similarity between two texts and evaluate the extent of their likeness. The previous methods for the Persian language have low accuracy and are unable to comprehend the structure and meaning of texts effectively. Additionally, these methods primarily focus on formal texts, but in real-world applications of text processing, there is a need for robust methods that can handle colloquial texts. This requires algorithms that consider the structure and significance of words based on context, rather than just the frequency of words. The lack of a proper dataset for this task in the Persian language makes it important to develop such algorithms and construct a dataset for Persian text. This paper introduces a new transformer-based model to measure semantic similarity between Persian informal short texts from social networks. In addition, a Persian dataset named FarSSiM has been constructed for this purpose, using real data from social networks and manually annotated and verified by a linguistic expert team. The proposed model involves training a large language model using the BERT architecture from scratch. This model, called FarSSiBERT, is pre-trained on approximately 104 million Persian informal short texts from social networks, making it one of a kind in the Persian language. Moreover, a novel specialized informal language tokenizer is provided that not only performs tokenization on formal texts well but also accurately identifies tokens that other Persian tokenizers are unable to recognize. It has been demonstrated that our proposed model outperforms ParsBERT, laBSE, and multilingual BERT in the Pearson and Spearman's coefficient criteria. Additionally, the pre-trained large language model has great potential for use in other NLP tasks on colloquial text and as a tokenizer for less-known informal words.
- Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation
Ghanbari, Alireza, Shirdel, Gholamhassan, Maleki, Farhad
Precision agriculture involves the application of advanced technologies to improve agricultural productivity, efficiency, and profitability while minimizing waste and environmental impact. Deep learning approaches enable automated decision-making for many visual tasks. However, in the agricultural domain, variability in growth stages and environmental conditions, such as weather and lighting, presents significant challenges to developing deep learning-based techniques that generalize across different conditions. The resource-intensive nature of creating extensive annotated datasets that capture these variabilities further hinders the widespread adoption of these approaches. To tackle these issues, we introduce a semi-self-supervised domain adaptation technique based on deep convolutional neural networks with a probabilistic diffusion process, requiring minimal manual data annotation. Using only three manually annotated images and a selection of video clips from wheat fields, we generated a large-scale computationally annotated dataset of image-mask pairs and a large dataset of unannotated images extracted from video frames. We developed a two-branch convolutional encoder-decoder model architecture that uses both synthesized image-mask pairs and unannotated images, enabling effective adaptation to real images. The proposed model achieved a Dice score of 80.7\% on an internal test dataset and a Dice score of 64.8\% on an external test set, composed of images from five countries and spanning 18 domains, indicating its potential to develop generalizable solutions that could encourage the wider adoption of advanced technologies in agriculture.
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- Asia > Middle East > Iran > Qom Province > Qom (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
Engineering Features to Improve Pass Prediction in Soccer Simulation 2D Games
Zare, Nader, Sarvmaili, Mahtab, Sayareh, Aref, Amini, Omid, Soares, Stan Matwin Amilcar
Soccer Simulation 2D (SS2D) is a simulation of a real soccer game in two dimensions. In soccer, passing behavior is an essential action for keeping the ball in possession of our team and creating goal opportunities. Similarly, for SS2D, predicting the passing behaviors of both opponents and our teammates helps manage resources and score more goals. Therefore, in this research, we have tried to address the modeling of passing behavior of soccer 2D players using Deep Neural Networks (DNN) and Random Forest (RF). We propose an embedded data extraction module that can record the decision-making of agents in an online format. Afterward, we apply four data sorting techniques for training data preparation. After, we evaluate the trained models' performance playing against 6 top teams of RoboCup 2019 that have distinctive playing strategies. Finally, we examine the importance of different feature groups on the prediction of a passing strategy. All results in each step of this work prove our suggested methodology's effectiveness and improve the performance of the pass prediction in Soccer Simulation 2D games ranging from 5\% (e.g., playing against the same team) to 10\% (e.g., playing against Robocup top teams).
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
- North America > Canada > Newfoundland and Labrador > Newfoundland > St. John's (0.04)
- (3 more...)
Noor-Ghateh: A Benchmark Dataset for Evaluating Arabic Word Segmenters in Hadith Domain
AlShuhayeb, Huda, Minaei-Bidgoli, Behrouz, Shenassa, Mohammad E., Hossayni, Sayyed-Ali
There are many complex and rich morphological subtleties in the Arabic language, which are very useful when analyzing traditional Arabic texts, especially in the historical and religious contexts, and help in understanding the meaning of the texts. Vocabulary separation means separating the word into different parts such as root and affix. In the morphological datasets, the variety of labels and the number of data samples helps to evaluate the morphological methods. In this paper, we present a benchmark data set for evaluating the methods of separating Arabic words which include about 223,690 words from the book of Sharia alIslam, which have been labeled by experts. In terms of the volume and variety of words, this dataset is superior to other existing data sets, and as far as we know, there are no Arabic Hadith Domain texts. To evaluate the dataset, we applied different methods such as Farasa, Camel, Madamira, and ALP to the dataset and we reported the annotation quality through four evaluation methods.
- Europe > Czechia > Prague (0.05)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.05)
- North America > United States > Pennsylvania (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)